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Score, Pseudo-Score and Residual 
Diagnostics for Spatial Point 
Process Models 



Cs| ■ Adrian Baddeley, Ege Rubak and Jesper M0ller 



Abstract. We develop new tools for formal inference and informal 
model validation in the analysis of spatial point pattern data. The 
score test is generalized to a "pseudo-score" test derived from Besag's 
pseudo-likelihood, and to a class of diagnostics based on point process 
residuals. The results lend theoretical support to the established prac- 
tice of using functional summary statistics, such as Ripley's if-function, 
when testing for complete spatial randomness; and they provide new 
tools such as the compensator of the ET-function for testing other fit- 
ted models. The results also support localization methods such as the 
scan statistic and smoothed residual plots. Software for computing the 
diagnostics is provided. 

Key words and phrases: Compensator, functional summary statistics, 
model validation, point process residuals, pseudo-likelihood. 



1. INTRODUCTION 

This paper develops new tools for formal infer- 
ence and informal model validation in the analysis 
of spatial point pattern data. The score test statistic, 
based on the point process likelihood, is generalized 
to a "pseudo-score" test statistic derived from Be- 
sag's pseudo-likelihood. The score and pseudo-score 
can be viewed as residuals, and further generalized 
to a class of residual diagnostics. 
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The likelihood score and the score test [61, 75], 
[22], pages 315 and 324, are used frequently in ap- 
plied statistics to provide diagnostics for model se- 
lection and model validation [2, 15, 19, 60, 77]. In 
spatial statistics, the score test has been used mainly 
to support formal inference about covariate effects 
[13, 47, 76] assuming the underlying point process 
is Poisson under both the null and alternative hy- 
potheses. Our approach extends this to a much wider 
class of point processes, making it possible (for ex- 
ample) to check for covariate effects or localized hot- 
spots in a clustered point pattern. 

Figure 1 shows three example data sets studied in 
the paper. Our techniques make it possible to check 
separately for "inhomogeneity" (spatial variation in 
abundance of points) and "interaction" (localized 
dependence between points) in these data. 

Our approach also provides theoretical support 
for the established practice of using functional sum- 
mary statistics such as Ripley's iT-function [63, 64] 
to study clustering and inhibition between points. In 
one class of models, the score test statistic is equiv- 
alent to the empirical iT-function, and the score 
test procedure is closely related to the customary 
goodness-of-fit procedure based on comparing the 
empirical i^-function with its null expected value. 
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(a) (b) (c) 

Fig. 1. Point pattern data sets, (a) Japanese black pine seedlings and saplings in a 10 x 10 metre quadrat [53, 54]- Reprinted 
by kind permission of Professors M. Numata and Y. Ogata, (b) Simulated realization of inhomogeneous Strauss process showing 
strong inhibition and spatial trend [7], Figure ^b. (c) Simulated realization of homogeneous Geyer saturation process showing 
moderately strong clustering without spatial trend [7], Figure ^c. 



Similar statements apply to the nearest neighbor 
distance distribution function G and the empty space 
function F. 

For computational efficiency, especially in large 
data sets, the point process likelihood is often re- 
placed by Besag's [14] pseudo-likelihood. The result- 
ing "pseudo-score" is a possible surrogate for the 
likelihood score in the score test. In one model, this 
pseudo-score test statistic is equivalent to a resid- 
ual version of the empirical iT-function, yielding 
a new, efficient diagnostic for model fit. However, in 
general, the interpretation of the pseudo-score test 
statistic is conceptually more complicated than that 
of the likelihood score test statistic, and hence diffi- 
cult to employ as a diagnostic. 

In classical settings the score test statistic is 
a weighted sum of residuals. For point processes 
the pseudo-score test statistic is a weighted point 
process residual in the sense of [4, 7]. This sug- 
gests a simplification, in which the pseudo-score test 
statistic is replaced by another residual diagnostic 
that is easier to interpret and to compute. 

In special cases this diagnostic is a residual version 
of one of the classical functional summary statis- 
tics K, G or F obtained by subtracting a "com- 
pensator" from the functional summary statistic. 
The compensator depends on the fitted model, and 
may also depend on the observed data. For exam- 
ple, suppose the fitted model is the homogeneous 
Poisson process. Then (ignoring some details) the 
compensator of the empirical ET-function K(r) is 
its expectation K$(r) = irr 2 under the model, while 
the compensator of the empirical nearest neighbor 
function G{r) is the empirical empty space func- 
tion F(r) for the same data. This approach pro- 



vides a new class of residual summary statistics that 
can be used as informal diagnostics for model fit, 
for a wide range of point process models, in close 
analogy with current practice. The diagnostics ap- 
ply under very general conditions, including the case 
of inhomogeneous point process models, where ex- 
ploratory methods are underdeveloped or inapplica- 
ble. For instance, Figure 2 shows the compensator 
of K(r) for an inhomogeneous Strauss process. 

Section 2 introduces basic definitions and assump- 
tions. Section 3 describes the score test for a gen- 
eral point process model, and Section 4 develops 
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Fig. 2. Empirical K -function (thick grey line) for the point 
pattern data in Figure 1(b), compensator of the K -function 
(solid black line) for a model of the correct form, and expected 
K -function for a homogeneous Poisson process (dashed line). 
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the important case of Poisson point process mod- 
els. Section 5 gives examples and technical tools for 
non-Poisson point process models. Section 6 devel- 
ops the general theory for our diagnostic tools. Sec- 
tion 7 applies these tools to tests for first order trend 
and hotspots. Sections 8-11 develop diagnostics for 
interaction between points, based on pairwise dis- 
tances, nearest neighbor distances and empty space 
distances, respectively. The tools are demonstrated 
on data in Sections 12-15. Further examples of di- 
agnostics are given in Appendix A. Appendices B-E 
provide technical details. 

2. ASSUMPTIONS 

2.1 Fundamentals 

A spatial point pattern data set is a finite set 
x = {x±, . . . , x n } of points Xi £ W, where the num- 
ber of points n(x) = n > is not fixed in advance, 
and the domain of observation W C R d is a fixed, 
known region of d-dimensional space with finite pos- 
itive volume \W\. We take d = 2, but the results 
generalize easily to all dimensions. 

A point process model assumes that x is a realiza- 
tion of a finite point process X in If without mul- 
tiple points. We can equivalently view X as a ran- 
dom finite subset of W. Much of the literature on 
spatial statistics assumes that X is the restriction 
X = Y n W of a stationary point process Y on the 
entire space M 2 . We do not assume this; there is no 
assumption of stationarity, and some of the models 
considered here are intrinsically confined to the do- 
main W. For further background material including 
measure theoretical details, see, for example, [50], 
Appendix B. 

Write X ~ Poisson(W, p) if X follows the Poisson 
process on W with intensity function p, where we as- 
sume v = J w p(u) du is finite. Then n(X) is Poisson 
distributed with mean v, and, conditional on n(X), 
the points in X are i.i.d. with density p(u)jv. 

Every point process model considered here is as- 
sumed to have a probability density with respect to 
Poisson(W, 1), the unit rate Poisson process, under 
one of the following scenarios. 

2.2 Unconditional Case 

In the unconditional case we assume X has a den- 
sity / with respect to Poisson(IT, 1). Then the den- 
sity is characterized by the property 

(1) E[h(X)]=E[h(Y)f(Y)] 

for all nonnegative measurable functionals h, where 
Y ~ Poisson (VF, 1). In particular, the density of 



Poisson (W, p) is 

(2) /(x) = exp^(l-pH)d^ Hp( Xi ). 

We assume that / is hereditary, that is, /(x) > 
implies /(y) > for all finite yCxClf. Processes 
satisfying these assumptions include (under integra- 
bility conditions) inhomogeneous Poisson processes 
with an intensity function, finite Gibbs processes 
contained in W, and Cox processes driven by ran- 
dom fields. See [38], Chapter 3, for an overview of 
finite point processes including these examples. In 
practice, our methods require the density to have 
a tractable form, and are only developed for Pois- 
son and Gibbs processes. 

2.3 Conditional Case 

In the conditional case, we assume X = Y n W 
where Y is a point process. Thus, X may depend 
on unobserved points of Y lying outside W. The 
density of X may be unknown or intractable. Under 
suitable conditions (explained in Section 5.4) mod- 
eling and inference can be based on the conditional 
distribution of X° = Xn W ° given X+ = Xn W+ = 
x + , where W + C W is a subregion, typically a re- 
gion near the boundary of W, and only the points in 
W° = W\ W + are treated as random. We assume 
that the conditional distribution of X° = X n W° 
given X + = X n W + = x + has an hereditary den- 
sity /(x°|x + ) with respect to Poisson (VF°, 1). Pro- 
cesses satisfying these assumptions include Markov 
point processes [74], [50], Section 6.4, together with 
all processes covered by the unconditional case. Our 
methods are only developed for Poisson and Markov 
point processes. 

For ease of exposition, we focus mainly on the un- 
conditional case, with occasional comments on the 
conditional case. For Poisson point process models, 
we always take W = W° so that the two cases agree. 

3. SCORE TEST FOR POINT PROCESSES 

In principle, any technique for likelihood-based in- 
ference is applicable to point process likelihoods. In 
practice, many likelihood computations require ex- 
tensive Monte Carlo simulation [31, 50, 51]. To min- 
imize such difficulties, when assessing the goodness 
of fit of a fitted point process model, it is natural to 
choose the score test which only requires computa- 
tions for the null hypothesis [61, 75]. 

Consider any parametric family of point process 
models for X with density fg indexed by a /c-dimen- 
sional vector parameter 9 £ G C For a simple 
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null hypothesis Hq :9 = 9q where 9q £ is fixed, the 
score test against any alternative Hi :9 G 0i, where 
01 C \ {#o}> is based on the score test statistic 
([22], page 315), 

- U(9q) y I(9q)- 1 U(9q). 



(3) 

Here U(9) = $log/ fl (x) and 1(9) = K e [U(9)U(9) T ] 
are the score function and Fisher information, re- 
spectively, and the expectation is with respect to fg. 
Here and throughout, we assume that the order of 
integration and differentation with respect to 9 can 
be interchanged. Under suitable conditions, the null 
distribution of T 2 is x 2 with k degrees of freedom. 
In the case k = 1 it may be informative to evaluate 
the signed square root 

(4) T = U(9 )/y/I(9oj, 

which is asymptotically -/V(0, 1) distributed under 
the same conditions. 

For a composite null hypothesis Hq : 9 S Go where 
©0 C is an m-dimensional submanifold with < 
m < k, the score test statistic is defined in [22], 
page 324. However, we shall not use this version 
of the score test, as it assumes differentiability of 
the likelihood with respect to nuisance parameters, 
which is not necessarily applicable here (as exempli- 
fied in Section 4.2). 

In the sequel we often consider models of the form 

(5) /(a,/3)( x ) = c(a, /3)h a (-x) exp((3S(x.)), 

where the parameter f3 and the statistic S(x) are one 
dimensional, and the null hypothesis is Hq : f3 = 0. 
For fixed a, this is a linear exponential family and (4) 
becomes 



T(a) = (5(x) -E (ai0) [5(X)])/^Vax (a>0) [5(X)]. 

In practice, when a is unknown, we replace a by 
its MLE under Hq so that, with a slight abuse of 
notation, the signed square root of the score test 
statistic is approximated by 

T = T(a) 

(6) 



= (5(x) -E {6i0) [5(X)])/^Var (Ai0) [5(X)]. 

Under suitable conditions, T in (6) is asymptotically 
equivalent to T in (4), and so a standard Normal 
approximation may still apply. 

4. SCORE TEST FOR POISSON PROCESSES 

Application of the score test to Poisson point pro- 
cess models appears to originate with Cox [21]. Con- 
sider a parametric family of Poisson processes, 
Poisson ( W, pg), where the intensity function is in- 



dexed by 9 S 0. The score test statistic is (3), where 

U(9) = y^Kg(xi) - Kg(u)p e (u)du, 



1(9) 



K e (u)no(u) po(u)du 



with Kg (u) = log pg (u) . Asymptotic results are giv- 
en in [45, 62]. 

4.1 Log-Linear Alternative 

The score test is commonly used in spatial epi- 
demiology to assess whether disease incidence de- 
pends on environmental exposure. As a particular 
case of (5), suppose the Poisson model has a log- 
linear intensity function 

(7) P(a,p)(u) = exp(a + pZ(u)), 

where Z(u),u £ W, is a known, real- valued and non- 
constant covariate function, and a and (5 are real 
parameters. Cox [21] noted that the uniformly most 
powerful test of Hq : (3 = (the homogeneous Pois- 
son process) against H\ : (3 > is based on the statis- 
tic 



(8) 



S(x) = ^Z( a 



Recall that, for a point process X on W with inten- 
sity function p, we have Campbell's Formula ([24], 
page 163), 

(9) E(V/i(xi)]=/ h(u)p(u)du 

for any Borel function h such that the integral on the 
right-hand side exists; and for the Poisson process 
Poisson(IU, p), 



(10) Yax(^h( Xi yj 



h(u) 2 p(u) du 



for any Borel function h such that the integral on 
the right-hand side exists. Hence, the standardized 
version of (8) is 

(11) T=(s(x)-k J Z(u)diij/Jk J Z(u) 2 du, 

where k = n/|W| is the MLE of the intensity k = 
exp(a) under the null hypothesis. This is a direct 
application of the approximation (6) of the signed 
square root of the score test statistic. 

Berman [13] proposed several tests and diagnos- 
tics for spatial association between a point process X 
and a covariate function Z(u). Berman's Z\ test is 
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equivalent to the Cox score test described above. 
Waller et al. [76] and Lawson [47] proposed tests for 
the dependence of disease incidence on environmen- 
tal exposure, based on data giving point locations of 
disease cases. These are also applications of the score 
test. Berman conditioned on the number of points 
when making inference. This is in accordance with 
the observation that the statistic n(x) is S-ancillary 
for /3, while <S(x) is S-sufficient for f3. 

4.2 Threshold Alternative and 
Nuisance Parameters 

Consider the Poisson process with an intensity 
function of "threshold" form, 



Pz,K,<f>(u) 



Kexp(</>) if Z(u) < z, 
k if Z(u) > z, 

where z is the threshold level. If z is fixed, this 
model is a special case of (7) with Z(u) replaced 
by \{Z(u) < z}, and so (8) is replaced by 



S*(x) = S(x, z) 



I{Z(x,)<z}, 



where !{•} denotes the indicator function. By (11) 
the (approximate) score test of Hq : 4> = against 
H\ : 7^ is based on 



T = T(z) = (5(x, z) - kA(z))/^/kA{z), 

where A(z) = \{u £ W : Z(u) < z}\ is the area of the 
corresponding level set of Z. 

If z is not fixed, then it plays the role of a nuisance 
parameter in the score test: the value of z affects 
inference about the canonical parameter (f>, which is 
the parameter of primary interest in the score test. 
Note that the likelihood is not differentiable with 
respect to z. 

In most applications of the score test, a nuisance 
parameter would be replaced by its MLE under the 
null hypothesis. However, in this context, z is not 
identifiable under the null hypothesis. Several solu- 
tions have been proposed [18, 25, 26, 33, 68]. They 
include replacing z by its MLE under the alterna- 
tive [18], maximizing T(z) or |T(z)| over z [25, 26], 
and finding the maximum p- value of T(z) or |T(z)| 
over a confidence region for z under the alterna- 
tive [68]. 

These approaches appear to be inapplicable to the 
current context. While the null distribution of T(z) 
is asymptotically N(0, 1) for each fixed zasK-> oo, 
this convergence is not uniform in z. The null distri- 
bution of S(x,z) is Poisson with parameter nA(z); 



sample paths of T{z) will be governed by Poisson 
behavior where A{z) is small. 

In this paper, our approach is simply to plot the 
score test statistic as a function of the nuisance pa- 
rameter. This turns the score test into a graphical 
exploratory tool, following the approach adopted in 
many other areas [2, 15, 19, 60, 77]. A second style 
of plot based on S(x, z) — kA{z) against z may be 
more appropriate visually. Such a plot is the lurking 
variable plot of [7]. Berman [13] also proposed a plot 
of 5(x, z) against z, together with a plot of hA(z) 
against diagnostic for dependence on Z. This 

is related to the Kolmogorov-Smirnov test since, un- 
der Ho, the values Yi = Zixi) are i.i.d. with distri- 
bution function P(Y <y) = A(y)/\W\. 

4.3 Hot Spot Alternative 

Consider the Poisson process with intensity 

(12) p K rf )V (u) = Kexp(<j)k(u-v)), 

where A; is a kernel (a probability density on M 2 ), 
k > and (f> are real parameters, and v £ ffi 2 is a nui- 
sance parameter. This process has a "hot spot" of 
elevated intensity in the vicinity of the location v. 
By (11) and (9)-(10) the score test of H : (f) = 
against Hi :(j)^0 is based on 



T = T(v) = (S(x i v)-kM 1 (y))/ \J kM 2 (v) , 
where 

S(x,v) = ^ k(xj - v) 

i 

is the usual nonparametric kernel estimate of point 
process intensity [28] evaluated at v without edge 
correction, and 



Mi(v) = I k(u-v) l du, i 
'w 



1,2. 



The numerator S(x,v) — kM\(v) is the smoothed 
residual field [7] of the null model. In the special 
case where k(u) oc < h} is the uniform density 

on a disc of radius h, the maximum m&x v T(v) is 
closely related to the scan statistic [1, 44]. 

5. NON-POISSON MODELS 

The remainder of the paper deals with the case 
where the alternative (and perhaps also the null) is 
not a Poisson process. Key examples are stated in 
Section 5.1. Non-Poisson models require additional 
tools including the Papangelou conditional intensity 
(Section 5.2) and pseudo-likelihood (Section 5.3). 
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5.1 Point Process Models with Interaction 

We shall frequently consider densities of the form 



(13) /(x) = c 



II A (^) 



exp((ftV(x)), 



where c is a normalizing constant, the first order 
term A is a nonnegative function, <j> is a real inter- 
action parameter, and V(x) is a real nonadditive 
function which specifies the interaction between the 
points. We refer to V as the interaction potential. 
In general, apart from the Poisson density (2) corre- 
sponding to the case (ft = 0, the normalizing constant 
is not expressible in closed form. 

Often the definition of V can be extended to all 
finite point patterns in M 2 so as to be invariant un- 
der rigid motions (translations and rotations). Then 
the model for X is said to be homogeneous if A is 
constant on W, and inhomogeneous otherwise. 

Let 



d(u,x) 



mm | 

j 



U — Xi 



denote the distance from a location u to its nearest 
neighbor in the point configuration x. For n(x) = 
n > 1 and i = 1, . . . ,n, define 

X_j =x \ {Xi}. 

In many places in this paper we consider the fol- 
lowing three motion-invariant interaction potentials 
V(x) = V(x, r) depending on a parameter r > 
which specifies the range of interaction. The Strauss 
process [73] has interaction potential 



(14) 



Vs(x,r) 



i<j 



H\\xi 



<r}, 



the number of r-close pairs of points in x; the Geyer 
saturation model [31] with saturation threshold 1 
has interaction potential 



(15) 



V G (x,r) 



I{d(xi,x^) < r}, 



the number of points in x whose nearest neighbor is 
closer than r units; and the Widom-Rowlinson pen- 
etrable sphere model [78] or area-interaction pro- 
cess [11] has interaction potential 



(16) 



Vk(x,r) 



wn[jB\ 



Xi,r, 



the negative area of W intersected with the union 
of balls B(xi,r) of radius r centered at the points 
of x. Each of these densities favors spatial clustering 
(positive association) when (ft > and spatial inhibi- 



tion (negative association) when (ft < 0. The Geyer 
and area-interaction models are well-defined point 
processes for any value of (ft [11, 31], but the Strauss 
density is integrable only when (ft < [43] . 

5.2 Conditional Intensity 

Consider a parametric model for a point process X 
in M 2 , with parameter 6 £ @. Papangelou [59] de- 
fined the conditional intensity of X as a nonnega- 
tive stochastic process \g(u, X) indexed by locations 
u G M 2 and characterized by the property that 



(17) 



£ h( Xl ,X\{ Xl }) 



h(u, X)A#(u, X) du 



for all measurable functions h such that the left 
or right-hand side exists. Equation (17) is known 
as the Georgii-Nguyen-Zessin (GNZ) formula [[30, 
41], [42, 52]]; see also Section 6.4.1 in [50]. Adapting 
a term from stochastic process theory, we will call 
the random integral on the right-hand side of (17) 
the (Papangelou) compensator of the random sum 
on the left-hand side. 

Consider a finite point process X in W. In the 
unconditional case (Section 2.2) we assume X has 
density fg(x) which is hereditary for all 9 G 0. We 
may simply define 



(18) 



\ e (u,x) = f e (xU{u})/f e (x) 



for all locations u£W and point configurations x C 
W such that u^x. Here we take 0/0 = 0. For Xi G x 
we set \g(xi,x) = Ag(xi,x_i), and for u ^ W we 
set Xg(u,x) = 0. Then it may be verified directly 
from (1) that (17) holds, so that (18) is the Papan- 
gelou conditional intensity of X. Note that the nor- 
malizing constant of fg cancels in (18). For a Poisson 
process, it follows from (2) and (18) that the Pa- 
pangelou conditional intensity is equivalent to the 
intensity function of the process. 

In the conditional case (Section 2.3) we assume 
that the conditional distribution of X° = Xfl W° 
given X + = Xfl W + = x + has a hereditary den- 
sity /5»(x°|x + ) with respect to Poisson(iy°, 1), for 
all 9 G 0. Then define 

. / e (x°U{u}|x+) 



(19) 



Xg(u,x° 



' / e (x°\{u}|x+) 

if u G W°, and zero otherwise. It can similarly be 
verified that this is the Papangelou conditional in- 
tensity of the conditional distribution of X° given 



X 



x 
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It is convenient to rewrite (18) in the form 
X g {u, x) = exp(A u log /(x)), 
where A is the one-point difference operator 

(20) A u h(x) = h(xU{u})-h(x\{u}). 

Note the Poincare inequality for the Poisson pro- 
cess X, 

(21) Var[/i(X)] < E f [A u h(X)] 2 p(u) du 

Jw 

holding for all measurable functionals h such that 
the right-hand side is finite; see [46, 79]. 

5.3 Pseudo-Likelihood and Pseudo-Score 

To avoid computational problems with point pro- 
cess likelihoods, Besag [14] introduced the pseudo- 
likelihood function 



PL(0) 



~[\f)(Xi,X.) 



(22) 



exp 



Ae(it,x) du 



This is of the same functional form as the likelihood 
function of a Poisson process (2), but has the Papan- 
gelou conditional intensity in place of the Poisson 
intensity. The corresponding pseudo-score 

PU(0) = ^logPL(0) 
(23) 86 

' — logA e (xi,x) - J — Xg(u,x) du 



is an unbiased estimating function, EgPU(#) = 0, by 
virtue of (17). In practice, the pseudo-likelihood is 
applicable only if the Papangelou conditional inten- 
sity Xq(u,x) is tractable. 

The pseudo-likelihood function can also be defined 
in the conditional case [39]. In (22) the product is 
instead over points %i G x° and the integral is in- 
stead over W°; in (23) the sum is instead over points 
x,i € x° and the integral is instead over W°; and 
in both places x = x° U x + . The Papangelou con- 
ditional intensity Xg(u,x) must also be replaced by 
A e (u,x°|x+). 

5.4 Markov Point Processes 

For a point process X constructed as X = Y n W 
where Y is a point process in IR 2 , the density and 
Papangelou conditional intensity of X may not be 
available in simple form. Progress can be made if Y 
is a Markov point process of interaction range R < 
oo; see [30, 52, 66, 74] and [50], Section 6.4.1. Briefly, 
this means that the Papangelou conditional inten- 



sity Xg(u,Y) of Y satisfies Xg(u,Y) = Xg(u,Y n 
B(u, R)), where B(u, R) is the ball of radius R cen- 
tered at u. Define the erosion of W by distance R, 

W eR = {u£W:B(u,R)cW}, 

and assume this has nonzero area. Let B = W\ Wqr 
be the border region. The process satisfies a spatial 
Markov property: the processes Y n Wqr and Y n 
W c are conditionally independent given YflB. 

In this situation we shall invoke the conditional 
case with W° = W QR and W+ = W\W°. The con- 
ditional distribution of X n W° given X n W + = x + 
has Papangelou conditional intensity 

A e (u,x°Ux+) iiueW°, 
otherwise. 
Thus, the unconditional and conditional versions of 
a Markov point process have the same Papangelou 
conditional intensity at locations in W°. 

For x° = {x±, . . . , x„o}, the conditional probability 
density given x + becomes 

Mx°|x + ) 



(24) A e (u,x°|x H 



[x + )Xg(x 1 ,X°)Y\_Xe(x i , {Xl, . . . ,SEi_l} U x + ) 



i=2 



if n° > 0, and fg(0\x + ) = cg(x. + ), where denotes 
the empty configuration, and the inverse normaliz- 
ing constant cg(x. + ) depends only on x + . 
For example, instead of (13) we now consider 



/(x°|x + )=c(x^ 



exp(^V(x°Ux + )), 



assuming V(y) is defined for all finite yd 2 such 
that for any u G M 2 \ y, A u V(y) depends only on u 
and y D B{u,R). This condition is satisfied by the 
interaction potentials (14)-(16); note that the range 
of interaction is R = r for the Strauss process, and 
R = 2r for both the Geyer and the area-interaction 
models. 

6. SCORE, PSEUDO-SCORE AND 
RESIDUAL DIAGNOSTICS 

This section develops the general theory for our 
diagnostic tools. 

By (6) in Section 3 it is clear that comparison 
of a summary statistic S*(x) to its predicted va- 
lue ES'(X) under a null model is effectively equiv- 
alent to the score test under an exponential family 
model where S(x) is the canonical sufficient statis- 
tic. Similarly, the use of a functional summary statis- 
tic S(x,z), depending on a function argument z, is 
related to the score test under an exponential family 
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model where z is a nuisance parameter and S(x,z) 
is the canonical sufficient statistic for fixed z. In this 
section we construct the corresponding exponential 
family models, apply the score test, and propose sur- 
rogates for the score test statistic. 

6.1 Models 

Let /e(x) be the density of any point process X 
on W governed by a parameter 9. Let <S(x, z) be 
a functional summary statistic of the point pattern 
data set x, with function argument z belonging to 
any space. 

Consider the extended model with density 

(25) /e,0, z (x) =ce, < j,, z fo(x)exp((j)S(x.,z)), 

where <f> is a real parameter, and cg^^ is the normal- 
izing constant. The density is well-defined provided 

M(9, <f>, z) = E[f e (Y) exp(0S(Y, z))} < oo, 

where Y ~ Poisson (VF, 1). The extended model is 
constructed by "exponential tilting" of the original 
model by the statistic S. By (6), for fixed 9 and z, 
assuming differentiability of M with respect to (f> in 
a neighborhood of <f) = 0, the signed root of the score 
test statistic is approximated by 

(26) T = (5(x, z) - E d [S(X, z)])/^Var^(X, z)}, 

where 9 is the MLE under the null model, and the 
expectation and variance are with respect to the null 
model with density fx. 

Insight into the qualitative behavior of the ex- 
tended model (25) can be obtained by studying the 
perturbing model 

(27) g^ >z (x) = k^ >z exp((j)S(x,z)), 

provided this is a well-defined density with respect 
to Poisson(W, 1), where k^ z is the normalizing con- 
stant. When the null hypothesis is a homogeneous 
Poisson process, the extended model is identical to 
the perturbing model, up to a change in the first or- 
der term. In general, the extended model is a quali- 
tative hybrid between the null and perturbing mod- 
els. 

In this context the score test is equivalent to naive 
comparison of the observed and null-expected val- 
ues of the functional summary statistic S. The test 
statistic T in (26) may be difficult to evaluate; typi- 
cally, apart from Poisson models, the moments (par- 
ticularly the variance) of S would not be available 
in closed form. The null distribution of T would also 
typically be unknown. Hence, implementation of the 
score test would typically require moment approxi- 



mation and simulation from the null model, which in 
both cases may be computationally expensive. Var- 
ious approximations for the score or the score test 
statistic can be constructed, as discussed in the se- 
quel. 

6.2 Pseudo-Score of Extended Model 

The extended model (25) is an exponential family 
with respect to (j>, having Papangelou conditional 
intensity 

K dt( p tZ (u,x) = \ e (u,x)exp((j)A u S(x,z)), 

where Xg(u,x) is the Papangelou conditional inten- 
sity of the null model. The pseudo-score function 
with respect to </>, evaluated at = 0, is 

PU(M = EMt^f / A u S(x,z)X e (u,x)du, 

where the first term 

(28) ZAS(x,z) = J2^Mx,z) 

i 

will be called the pseudo-sum of S. If 9 is the maxi- 
mum pseudo-likelihood estimate (MPLE) under Hq, 
the second term with 9 replaced by 9 becomes 

(29) CAS(x,z) = / A u S(x,z)X § (u,x)du 

Jw 

and will be called the ( estimated) pseudo-compensator 
of S. We call 

TZAS(x,z) = PU(6,z) 

(30) 

= EA5(x,z) -CAS(x,z) 

the pseudo-residual since it is a weighted residual in 
the sense of [7]. 

The pseudo-residual serves as a surrogate for the 
numerator in the score test statistic (26). For the 
denominator, we need the variance of the pseudo- 
residual. Appendix B gives an exact formula (66) 
for the variance of the pseudo-score PU(6,z), which 
can serve as an approximation to the variance of 
the pseudo-residual TZAS(x,z). This is likely to be 
an overestimate, because the effect of parameter es- 
timation is typically to deflate the residual vari- 
ance [7]. 

The first term in the variance formula (66) is 

(31) C 2 AS(x,z)= [ [A u S(x,z)] 2 \ § (u,x)du, 

which we shall call the Poincare pseudo-variance be- 
cause of its similarity to the Poincare upper bound 
in (21). It is easy to compute this quantity along- 
side the pseudo-residual. Rough calculations in Sec- 
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tions 9.4 and 10.3 suggest that the Poincare pseudo- 
variance is likely to be the dominant term in the 
variance, except at small r values. The variance of 
residuals is also studied in [17]. 

For computational efficiency we propose to use the 
square root of (31) as a surrogate for the denomi- 
nator in (26). This yields a "standardized" pseudo- 
residual 

(32) TAS(x, z ) = TZAS(x, z)/ v / C 2 AS(x, z). 

We emphasize that this quantity is not guaranteed 
to have zero mean and unit variance (even approxi- 
mately) under the null hypothesis. It is merely a com- 
putationally efficient surrogate for the score test sta- 
tistic; its null distribution must be investigated by 
other means. Asymptotics of TAS(x, z) under 
a large-domain limit [69] could be studied, but limit 
results are unlikely to hold uniformly over r. In this 
paper we evaluate null distributions using Monte 
Carlo methods. 

The pseudo-sum (28) can be regarded as a func- 
tional summary statistic for the data in its own 
right. Its definition depends only on the choice of the 
statistic S, and it may have a meaningful interpre- 
tation as a nonparametric estimator of a property 
of the point process. The pseudo-compensator (29) 
might also be regarded as a functional summary 
statistic, but its definition involves the null model. 
If the null model is true, we may expect the pseudo- 
residual to be approximately zero. Sections 9-11 and 
Appendix A study particular instances of pseudo- 
residual diagnostics based on (28)-(30). 

In the conditional case, the Papangelou condi- 
tional intensity \g(u,x) must be replaced by \g(u, 
x°|x + ) given in (19) or (24). The integral in the 
definition of the pseudo-compensator (29) must be 
restricted to the domain W° , and the summation 
over data points in (28) must be restricted to points 
Xi S W° , that is, to summation over points of x°. 

6.3 Residuals 

A simpler surrogate for the score test is available 
when the canonical sufficient statistic S of the per- 
turbing model is naturally expressible as a sum of 
local contributions 

(33) S(x,z) = } j s(x i ,x- i ,z). 

i 

Note that any statistic can be decomposed in this 
way unless some restriction is imposed on s; such 
a decomposition is not necessarily unique. We call 
the decomposition "natural" if s(u,x,z) only de- 
pends on points of x that are close to u, as demon- 



strated in the examples in Sections 9, 10 and 11 and 
in Appendix A. 

Consider a null model with Papangelou conditional 
intensity Xg(u,x). Following [7], define the (s-weight- 
ed) innovation by 

(34) XS(x, r) = S(x, z) — \ s(u,x, z)Xg(u,x)du, 

Jw 

which by the GNZ formula (17) has mean zero un- 
der the null model. In practice, we replace 6 by an 
estimate 6 (e.g., the MPLE) and consider the (s- 
weighted) residual 

(35) lZS(x, z) = S(x, z) — / s(u,x, z)Xg(u,x) du. 

The residual shares many properties of the score 
function and can serve as a computationally efficient 
surrogate for the score. The data-dependent integral 

(36) CS(x,z) = / s(u,x, z)X^(u,x) du 

Jw 

is the (estimated) Papangelou compensator of S. 
The variance of lZS(x, z) can be approximated by 
the innovation variance, given by the general vari- 
ance formula (65) of Appendix B. The first term 
in (65) is the Poincare variance 

(37) C S(x,z) = / s(u,x,z) 2 \g(u,x)du. 

Jw 

Rough calculations reported in Sections 9.4 and 10.3 
suggest that the Poincare variance is likely to be the 
largest term in the variance for sufficiently large r. 
By analogy with (31) we propose to use the Poincare 
variance as a surrogate for the variance of lZS(x, z), 
and thereby obtain a "standardized" residual 

(38) TS(x,z)=TZS(x,z)/ v / C 2 S(x,z). 

Once again TS(x, z) is not exactly standardized, be- 
cause C 2 S(x, z) is an approximation to Var[72.5(x, z)\ 
and because the numerator and denominator of (38) 
are dependent. The null distribution of TS(x, z) 
must be investigated by other means. 

In the conditional case, the integral in the defi- 
nition of the compensator (36) must be restricted 
to the domain W° , and the summation over data 
points in (33) must be restricted to points X{ £ W°, 
that is, to summation over points of x°. 

7. DIAGNOSTICS FOR FIRST ORDER TREND 

Consider any null model with density f$(x) and 
Papangelou conditional intensity X$(u, x). By anal- 
ogy with Section 4 we consider alternatives of the 



10 



A. BADDELEY, E. RUBAK AND J. M0LLER 



form (25) where 



S(x,z) 



for some function s. The perturbing model (27) is 
a Poisson process with intensity exp(^>s(-, z)), 
where z is a nuisance parameter. The score test is 
a test for the presence of an (extra) first order trend. 
The pseudo-score and residual diagnostics are both 
equal to 



X{ , Z ) 



(39) 



s(u, z)Xx(u, x) du. 



This is the s-weighted residual described in [7]. The 
variance of (39) can be estimated by simulation, or 
approximated by the Poincare variance (37). 

If Z is a real-valued covariate function on W, then 
we may take s(u,z) = I{Z(u) < z] for 2£l, cor- 
responding to a threshold effect (cf. Section 4.2). 
A plot of (39) against z was called a lurking vari- 
able plot in [7]. 

If s(u, z) = k(u— z) for z GE?, where A: is a density 
function on M 2 , then 



, Jw 



du, 



which was dubbed the smoothed residual field in [7] . 
Examples of application of these techniques have 
been discussed extensively in [7]. 

8. INTERPOINT INTERACTION 

In the remainder of the paper we concentrate on 
diagnostics for interpoint interaction. 

8.1 Classical Summary Statistics 

Following Ripley's influential paper [64], it is stan- 
dard practice, when investigating association or de- 
pendence between points in a spatial point pattern, 
to evaluate functional summary statistics such as 
the ET-function, and to compare graphically the em- 
pirical summaries and theoretical predicted values 
under a suitable model, often a stationary Poisson 
process ("Complete Spatial Randomness," CSR) [23, 
29, 64]. 

The three most popular functional summary sta- 
tistics for spatial point processes are Ripley's K- 
function, the nearest neighbor distance distribution 
function G and the empty space function (spheri- 
cal contact distance distribution function) F. Defi- 
nitions of K, G and F and their estimators can be 



seen in [9, 23, 29, 50]. Simple empirical estimators 
of these functions are of the form 



(40) 



K(r) 



Kjr) 



P(*)\W\ ^ 



e K (xi,Xj)I{\\xi - Xj\\ < r}, 



(41) 



G(r)=G x (r) 



1 



(42) 



n(x) ^— J 
F(r) = F x (r) 



e G (xi,x_j,r)I{d(xj,x_i) < r}, 



1 



e^(ii, r)I{d(u, x) < r} du, 



\W\ j w 

where ex{u,v), ec{u,x,r) and ep(u,r) are edge cor- 
rection weights, and typically p 2 (x) =n(x)(n(x) — 
1)/\W\ 2 . 

8.2 Score Test Approach 

The classical approach fits naturally into the 
scheme of Section 6. In order to test for depen- 
dence between points, we choose a perturbing model 
that exhibits dependence. Three interesting exam- 
ples of perturbing models are the Strauss process, 
the Geyer saturation model with saturation thresh- 
old 1 and the area-interaction process, with interac- 
tion potentials Vs(x,r), Vg(x, r) and Va(x,t) given 
in (14)-(16). The nuisance parameter r > deter- 
mines the range of interaction. It is interesting to 
note that, although the Strauss density is integrable 
only when <f> < 0, the extended model obtained by 
perturbing fg by the Strauss density may be well- 
defined for some 4> > 0. This extended model may 
support alternatives that are clustered relative to 
the null, as originally intended by Strauss [73]. 

The potentials of these three models are closely re- 
lated to the summary statistics K, G and F in (40)- 

(42) . Ignoring the edge correction weights e(-), we 
have 

2\W\ 

(43) K x (r)^ - ' ' ^ Vs(x,t), 



(44) G x (r) 



n(x)(n(x 
1 



1) 



(45) 



F x {r) 



n(x) 
1 



W 



Vc(x,r), 



To draw the closest possible connection with the 
score test, instead of choosing the Strauss, Geyer or 
area-interaction process as the perturbing model, we 
shall take the perturbing model to be defined through 
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(27) where S is one of the statistics K, G or F. We 
call these the (perturbing) K -model, G -model and 
F -model, respectively. The score test is then pre- 
cisely equivalent to comparing K, G or F with its 
predicted expectation using (6). 

Essentially K, G, F are renormalized versions 
of Vs, Vjg, Va as shown in (43)-(45). In the case 
of F the renormahzation is not data-dependent, so 
the F-model is virtually an area-interaction model, 
ignoring edge correction. For K, the renormahza- 
tion depends only on n(x), and so, conditionally 
on n(x) = n, the X-model and the Strauss process 
are approximately equivalent. Similarly for G, the 
normalization also depends only on n(x), so, condi- 
tionally on n(x) = n, the 67-model and Geyer satu- 
ration process are approximately equivalent. If we 
follow Ripley's [64] recommendation to condition 
on n when testing for interaction, this implies that 
the use of the K, G or F-function is approximately 
equivalent to the score test of CSR against a Strauss, 
Geyer or area-interaction alternative, respectively. 

When the null hypothesis is CSR, we saw that the 
extended model (25) is identical to the perturbing 
model, up to a change in intensity, so that the use 
of the ET-function is equivalent to testing the null 
hypothesis of CSR against the alternative of a K- 
model; similarly for G and F. For a more general 
null hypothesis, the use of the -fC-function, for ex- 
ample, corresponds to adopting an alternative hy- 
pothesis that is a hybrid between the fitted model 
and a iT-model. 

Note that if the edge correction weight ex(u, v) 
is uniformly bounded, the fT-model is integrable for 
all values of <ft, avoiding a difficulty with the Strauss 
process [43]. 

Computation of the score test statistic (26) re- 
quires estimation or approximation of the null vari- 
ance of K(r), G(r) or F(r). A wide variety of ap- 
proximations is available when the null hypothesis is 
CSR [29, 65]. For other null hypotheses, simulation 
estimates would typically be used. A central limit 
theorem is available for K(r), G{r) and F(r) in the 
large-domain limit, for example, [3, 34, 35, 40, 65]. 
However, convergence is not uniform in r, and the 
normal approximation will be poor for small values 
of r. Instead Ripley [63] developed an exact Monte 
Carlo test [12, 36] based on simulation envelopes of 
the summary statistic under the null hypothesis. 

In the following sections we develop the resid- 
ual and pseudo-residual diagnostics corresponding 
to this approach. 



9. RESIDUAL DIAGNOSTICS FOR 
INTERACTION USING PAIRWISE DISTANCES 

This section develops residual (35) and pseudo- 
residual (30) diagnostics derived from a summary 
statistic S which is a sum of contributions depending 
on pairwise distances. 

9.1 Residual Based on Perturbing Strauss Model 

9.1.1 General derivation Consider any statistic of 
the general "pairwise interaction" form 

(46) S(x.,r) = ^2q({xi,Xj},r). 

i<j 

This can be decomposed in the local form (33) with 
s(u,x,r) = -^2q({xi,u},r), u<£x. 

i 

Hence, 

A Xi S(x,r) = 2s(xi,x_j,r) and 
A u S(x,r) = 2s(u,x,r), u £ x. 

Consequently, the pseudo-residual and the pseudo- 
compensator are just twice the residual and the Pa- 
pangelou compensator: 

(47) EAS(x, r) = 2S(x, r) = ]T q({ Xi , Xj }, r), 
CAS(x,r) = 2CS{x,r) 

(48) 

= / y2q({xi,u},r)X § (u,x)du, 
KAS(x, z) = TRS{x,r) 

(49) 

= 2S(x,r) -2C5(x,r). 

9.1.2 Residual of Strauss potential The Strauss in- 
teraction potential Vs of (14) is of the general 
form (46) with q({xi,Xj},r) = I{\\xi — Xj\\ <r}. 
Hence, Vs can be decomposed in the form (33) with 
s(u, x,r) = 7}t(u,x,r), where 

t(n,x,r) = y^IjUu — < r}, u ^ x. 

i 

Hence, the Papangelou compensator of Vs is 

(50) CV s {-K,r) = - [ t(u,x,r)Ag(n,x)dn. 
z Jw 

9.1.3 Case of CSR If the null model is CSR with 
intensity p estimated by p = n(x)/\W\ (the MLE, 
which agrees with the MPLE in this case), the Pa- 
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pangelou compensator (50) becomes 

CV r 5 (x,r) = |/ Vl{||n-x 4 || <r}dn 

= ^\WnB( Xi ,r)\. 

i 

Ignoring edge effects, we have \W fl B(xi,r)\ ~ 7rr 2 
and, applying (43), the residual is approximately 

(51) W 5 (x,r)^^[A x (r)-^r 2 ]. 

The term in brackets is a commonly-used measure 
of departure from CSR, and is a sensible diagnostic 
because A(r) = irr 1 under CSR. 

9.2 Residual Based on Perturbing AT-Model 

Assuming /> 2 (x) = p 2 (n(x)) depends only on n(x), 
the empirical A-function (40) can also be expressed 
as a sum of local contributions A x (r) = 
x_j,r) with 



k(u, x, r) 



P 2 (n(x) + l)\W\ 



■u ^ x, 



where 



t w (u,x,r) = ex(u, Xj)I{\\u — Xj\\ < r} 

3 



is a weighted count of the points of x that are r- 
close to the location u. Hence, the compensator of 
the A'-function is 

CA x (r) 



(52) 



p 2 (n(x) + l)\W\ 

t w (u, x, r)Xs(u, x) du. 



Assume the edge correction weight ex(u,v) = 
eK(v,u) is symmetric; for example, this is satisfied 
by the Ohser-Stoyan edge correction weight [57, 58] 
given by ck(u,v) = l/\W u nW v \ where W u = {u + v : 
v £ W}, but not by Ripley's [63] isotropic correction 
weight. Then the increment is, for u ^ x, 



+ 



p 2 (xU{u}) 
p 2 (xU{u})\W\ 



and when x; £ x 



+ 



P 2 (x^) 
2r(^,x_ t ,r) 
p 2 (*- t )\W\ ' 



Assuming the standard estimator p 2 (x) =n(n— 1)/ 
[W^ 2 with n = n(x), the pseudo-sum is seen to be 
zero, so the pseudo-residual is apart from the sign 
equal to the pseudo-compensator, which becomes 

2 f 

\x(u,x) du 



CAA x (r) = 2CA x (r)- 



n-2 



where CA x (r) is given by (52). So if the null model 
is CSR and the intensity is estimated by n/|W|, the 
pseudo-residual is approximately 2[A x (r) — CK x (r)], 
and, hence, it is equivalent to the residual approx- 
imated by (51). This is also the conclusion in the 
more general case of a null model with an activity 
parameter k, that is, where the Papangelou condi- 
tional intensity factorizes as 

A (u,x) =k£p(u,x), 

where 6 = (k,(3) and £^(0 is a Papangelou condi- 
tional intensity, since the pseudo-likelihood equa- 
tions then imply that n = J w Xs(u, x) du. 

In conclusion, the residual diagnostics obtained 
from the perturbing Strauss and A-models are very 
similar, the major difference being the data-depen- 
dent normalization of the A-function; similarly for 
pseudo-residual diagnostics which may be effectively 
equivalent to the residual diagnostics. In practice, 
the popularity of the A- function seems to justify 
using the residual diagnostics based on the perturb- 
ing A-model. Furthermore, due to the familiarity of 
the A-function, we often choose to plot the com- 
pensator^) of the fitted model(s) in a plot with the 
empirical A-function rather than the residual(s) for 
the fitted model. 

9.3 Edge Correction in Conditional Case 

In the conditional case, the Papangelou condi- 
tional intensity \§(u, x) is known only at locations 
u S W°. The diagnostics must be modified accord- 
ingly, by restricting the domain of summation and 
integration to W°. Appropriate modifications are 
discussed in Appendices C-E. 

9.4 Approximate Residual Variance Under CSR 

Here we study the residual variance and the ac- 
curacy of the Poincare variance approximation in 
a simple case. 

We shall approximate the residual variance 
Var [IZVs (X, r)] by the innovation variance 
Yax[XVs(X,r)] } that is, ignoring the effect of pa- 
rameter estimation. It is likely that this approxi- 
mation is conservative, because the effect of param- 
eter estimation is typically to deflate the residual 
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variance [7]. A more detailed investigation has been 
conducted in [17]. 

Assume the null model is CSR with intensity p 
estimated by p = n(x)/\W\. The exact variance of 
the innovation for the Strauss canonical statistic Vs 
is Var[ZVs(X, r)] = I\ + I2 from equation (65) of 
Appendix B, where 

h = - [ E[t(u,X,r) 2 X(u,X)]du 
4 Jw 



^E[i(u,X,r) 2 ]du 



and 
h 



1 

4 Jw 
P 2 



W,[l{\\u-v\\ <r}X 2 (u,v,X.)]dudv 



I{||u — v\\ < r}dudv 



w JW 



as A(u,X) = p and \2(u,v,X) = X(u,X.)X(v, 
XU {u}) = p 2 . This is reminiscent of expressions 
for the large-domain limiting variance of K under 
CSR obtained using the methods of [/-statistics [16, 
48, 65], summarized in [29], page 51 ff. Now Y = 
t(u, X,r) is Poisson distributed with mean p = 
p\B(u,r)DW\ so that E(Y 2 ) = p + p 2 . For u £ W er 
we have p = pirr 2 , so ignoring edge effects 



P , 



P 



I lf n^(u + i/)\W\ and h~^v\W\, 

where v = ptrr 2 . Note that since v is the expected 
number of points within distance r of a given point, 
a value of v = 1 corresponds to the scale of nearest- 
neighbor distances in the pattern, r nn = 1/ yJWp. For 
the purposes of the K function this is a "short" 
distance. Hence, it is reasonable to describe I\ as 
the "leading term" in the variance, since I\ S> I2 for 

Meanwhile, the Poincare variance (37) is 
n(x) 



CV s (x, 



£(«,x,r) du, 



Mw\j w 

which is an approximately unbiased estimator of I\ 
by Fubini's Theorem. Hence, 

ECV s (x,r) EC 2 y 5 (x,r) 
Var[W s (X,r)] ~ Var[Xy s (X, r)] 

h ^ l + v 
~ h + I 2 " 2 + v' 
Thus, as a rule of thumb, the Poincare variance un- 
derestimates the true variance; the ratio of means is 
(1 + i/)/(2 + u) > 1/2. The ratio falls to 2/3 when 
v = 1, that is, when r = r nn = 1/^/Wp. We can take 



this rule-of-thumb indicating the value of r be- 
low which the Poincare variance is a poor approxi- 
mation to the true variance. 

10. RESIDUAL DIAGNOSTICS FOR 
INTERACTION USING NEAREST NEIGHBOR 
DISTANCES 

This section develops residual and pseudo-residual 
diagnostics derived from summary statistics based 
on nearest neighbor distances. 

10.1 Residual Based on Perturbing Geyer Model 

The Geyer interaction potential Vc(x.,r) given 
by (15) is clearly a sum of local statistics (33), and 
its compensator is 



CV G ( X ,r) 



w 



I{d(u,x) < r}Xx(u, x) du. 



The Poincare variance is equal to the compensator 
in this case. Ignoring edge effects, Vc(x,r) is ap- 
proximately ra(x)G7 x (r); cf. (41). 

If the null model is CSR with estimated intensity 
k = n(x)/\W\, then 



CV G (x,r) 



Wn{jB(xi,r) 



ignoring edge effects, this is approximately «;|VF|F(r); 
cf. (42). Thus, the residual diagnostic is approxi- 
mately n(x)(G?(r) — F(r)). This is a reasonable di- 
agnostic for departure from CSR, since F = G under 
CSR. This argument lends support to Diggle's [27], 
equation (5.7), proposal to judge departure from 
CSR using the quantity sup \G — F\. 

This example illustrates the important point that 
the compensator of a functional summary statistic S 
should not be regarded as an alternative parametric 
estimator of the same quantity that S is intended 
to estimate. In the example just given, under CSR 
the compensator of G is approximately F, a quali- 
tatively different and in some sense "opposite" sum- 
mary of the point pattern. 

We have observed that the interaction potential Vq 
of the Geyer saturation model is closely related to G. 
However, the pseudo-residual associated to Vq is 
a more complicated statistic, since a straightforward 
calculation shows that the pseudo-sum is 

SAV/ G (x,r) 

= Vb(x,r) + 22 XT ^iW x i ~ x iW - r and 

d{xj,yi-i) > r}, 
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and the pseudo-compensator is 

CAV G (x,r) = I{d(u,x) <r}X e (u,x)du 
Jw 

+ ^I{d(xi,x_i) >r} 



I{||w — < r}Xx(u, x) du. 



10.2 Residual Based on Perturbing G-Model 

The empirical G- function (41) can be written 

(53) G x (r) = ^2g(xi,x-i,r), 



where 



(54) 



l{u,x,r) 



1 



-e G (u,x,r) 



n(x) + 1 
• l{d(u,x) < r}, utfix, 



so that the Papangelou compensator of the empirical 
G-function is 



CG x (r) 



g(u,x,r)Xs(u,x) du 



w 



n(x 



WC\\J i B{x i ,r) 



e G (u, x, r)Xg(u, x) du. 



The residual diagnostics obtained from the Geyer 
and G-models are very similar, and we choose to 
use the diagnostic based on the popular G-function. 
As with the if- function, we typically use the com- 
pensator^) of the fitted model(s) rather than the 
residual(s), to visually maintain the close connec- 
tion to the empirical G-function. 

The expressions for the pseudo-sum and pseudo- 
compensator of G are not of simple form, and we 
refrain from explicitly writing out these expressions. 
For both the G- and Geyer models, the pseudo-sum 
and pseudo-compensator are not directly related to 
a well-known summary statistic. We prefer to plot 
the pseudo-residual rather than the pseudo-sum and 
pseudo-compensator (s). 

10.3 Residual Variance Under CSR 

Again assume a Poisson process of intensity p as 
the null model. Since V G is a sum of local statistics, 

V G (x,r) = y^J{d(xj,x\xj) < r}, 



we can again apply the variance formula (65) of 
Appendix B, which gives Var[XVc(X, r)] = L\ + L 2 , 
where 

Li = p F{d(u,X.) < r}du 
Jw 



and 
L 2 = P 2 



F{\\u-v\\ <r, 



w Jw 



d(u, X) > r, d(v, X) > r} du dv. 



The Poincare variance is equal to the compensator 
in this case, and is 

C 2 V G (x,r)= I l{d(u,x) <r}X e (u,x)du 
Jw 

n(x) 



\W\ 



\WnU{x,r)\, 



where U(x, r) = \J- b{xi, r). The Poincare variance is 
an approximately unbiased estimator of the term L\. 

For u G W Qr we have P{d(u,X) < r} = 1 - 
exp(— pur 2 ) so that 

L 1 «p|^|(l-exp(-pvrr 2 )), 

ignoring edge effects. Again, let v = pirr 2 so that 
L\ w p|W|(l - exp(— v)). Meanwhile, 

F{d(u,X) >r, d{v,X)>r} 
= exp(— p\b(u, r) U b(v, r)\). 

This probability lies between exp(— v) and exp(— 2u) 
for all u, v. Thus (ignoring edge effects), 

L 2 w p 2 Trr 2 \W\ exp(-(l + S)v) 
= pu\W\exp(-(l + S)u), 

where < 5 < 1. Hence, 

L 2 < ise-» 



L x ~ 1 - e~ v 

Let f{v) = ve~ v /(\ - e~ v ). Then f(u) is strictly 
decreasing and f(y) < 1 for all v > so that L\j 
{L\ + L 2 ) > i, that is, the variance is underesti- 
mated by at most a factor of 2. Note that /(1.25) ~ 
0.5, so L\/(L\ + L 2 ) > I when r < r cr i t , where r cr i t = 
Y^L25/7rp. The conclusions and rule-of-thumb for 
TG are similar to those obtained for TK in Sec- 
tion 9.4. 
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11. DIAGNOSTICS FOR INTERACTION 
BASED ON EMPTY SPACE DISTANCES 

11.1 Pseudo-Residual Based on Perturbing 
Area-Interaction Model 

When the perturbing model is the area-interaction 
process, it is convenient to reparametrize the den- 
sity, such that the canonical sufficient statistic V A 
given in (16) is redefined as 

1 



Vk(x,r) 



\W\ 



Wn\jB( Xi ,r) 



This summary statistic is not naturally expressed as 
a sum of contributions from each point as in (33), 
so we shall only construct the pseudo-residual. Let 



U(x,r) = Wn\jB( 



Xi,r) 



The increment 
A u V A (pc,r) 
1 



W 



(|C/(xU{u},r)|-|[/(x,r)|), u £ x, 



can be thought of as "unclaimed space" — the pro- 
portion of space around the location u that is not 
"claimed" by the points of x. The pseudo-sum 

EAy A (x,r) = ^A a: .Vk(x,r) 

i 

is the proportion of the window that has "single 
coverage" — the proportion of locations in W that 
are covered by exactly one of the balls B(xi, r). This 
can be used in its own right as a functional summary 
statistic, and it corresponds to a raw (i.e., not edge 
corrected) empirical estimate of a summary func- 
tion F±(r) defined by 

F 1 (r)=F(#{xeX\d(u,x)<r} = l) 

for any stationary point process X, where u£M 2 is 
arbitrary. Under CSR with intensity p we have 

F\{r) = pirr 2 exp(— pirr 2 ). 

This summary statistic does not appear to be treated 
in the literature, and it may be of interest to study 
it separately, but we refrain from a more detailed 
study here. 

The pseudo-compensator corresponding to this 
pseudo-sum is 

CAV A (x,r) = / A u V A (x,r)X § (u,x)du. 
Jw 

This integral does not have a particularly simple 
interpretation even when the null model is CSR. 



11.2 Pseudo-Residual Based on 
Perturbing F-Model 

Alternatively, one could use a standard empirical 
estimator F of the empty space function F as the 
summary statistic in the pseudo-residual. The pseu- 
do-sum associated with the perturbing F-model is 

EAF x ( r )=n(x)F x (r)-VF x _,(r), 



i 



with pseudo-compensator 



CAF x (r) = / (F xU{u} (r)-F x (r))\ § (u,x)du. 
Jw 

Ignoring edge correction weights, F xU ^ u y(r) — F x (r) 
is approximately equal to A u V A (x, r), so the pseudo- 
sum and pseudo-compensator associated with the 
perturbing i^-model are approximately equal to the 
pseudo-sum and pseudo-compensator associated with 
the perturbing area-interaction model. Here, we usu- 
ally prefer graphics using the pseudo-compensator(s) 
and the pseudo-sum since this has an intuitive in- 
terpretation as explained above. 

12. TEST CASE: TREND WITH INHIBITION 

In Sections 12-14 we demonstrate the diagnostics 
on the point pattern data sets shown in Figure 1. 
This section concerns the synthetic point pattern in 
Figure 1(b). 

12.1 Data and Models 

Figure 1(b) shows a simulated realization of the 
inhomogeneous Strauss process with first order term 
\(x,y) = 200exp(2x + 2y + 3a; 2 ), interaction range 
R = 0.05, interaction parameter 7 = exp(<^>) = 0.1 
and W equal to the unit square; see (13) and (14). 
This is an example of extremely strong inhibition 
(negative association) between neighboring points, 
combined with a spatial trend. Since it is easy to 
recognize spatial trend in the data (either visually or 
using existing tools such as kernel smoothing [28]), 
the main challenge here is to detect the inhibition 
after accounting for the trend. 

We fitted four point process models to the data in 
Figure 1(b). They were (A) a homogeneous Poisson 
process (CSR); (B) an inhomogeneous Poisson pro- 
cess with the correct form of the first order term, 
that is, with intensity 

(55) p(x, y) = exp(A) + Pix + (3 2 y + (3 3 x 2 ), 

where /3o, • • • , 03 are real parameters; (C) a homoge- 
neous Strauss process with the correct interaction 
range R = 0.05; and (D) a process of the correct 
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(a) (b) 

Fig. 3. Residual diagnostics based on pairwise distances, for a model of the correct form fitted to the data in Figure 1(b) 
(a) Residual K -function and two-standard-deviation limits under the fitted model of the correct form, (b) Standardized resi 
K-function under the fitted model of the correct form. 



form, that is, inhomogeneous Strauss with the cor- 
rect interaction range R = 0.05 and the correct form 
of the first order potential (55). 

12.2 Software Implementation 

The diagnostics defined in Sections 9-11 were im- 
plemented in the R language, and has been publicly 
released in the spat st at library [6]. Unless other- 
wise stated, models were fitted by approximate max- 
imum pseudo-likelihood using the algorithm of [5] 
with the default quadrature scheme in spat st at, 
having an m x m grid of dummy points where m = 
max(25, 10[l + 2yn(x)/10]) was equal to 40 for most 
of our examples. Integrals over the domain W were 
approximated by finite sums over the quadrature 
points. Some models were refitted using a finer grid 
of dummy points, usually 80 x 80. In addition to 
maximum pseudo-likelihood estimation, the software 
also supports the Huang-Ogata [37] approximate 
maximum likelihood. 

12.3 Application of K Diagnostics 

12.3.1 Diagnostics for correct model First we fit- 
ted a point process model of the correct form (D). 
The fitted parameter values were 7 = 0.217 and f3 = 
(5.6, —0.46, 3.35, 2.05) using the coarse grid of dum- 
my points, and 7 = 0.170 and j3 = (5.6, —0.64,4.06, 
2.44) using the finer grid of dummy points, as against 
the true values 7 = 0.1 and (3 = (5.29, 2, 2, 3). 



Figure 2 in Section 1 shows K along with its com- 
pensator for the fitted model, together with the the- 
oretical X-function under CSR. The empirical K- 
function and its compensator coincide very closely, 
suggesting correctly that the model is a good fit. Fig- 
ure 3(a) shows the residual ET-function and the two- 
standard-deviation limits, where the surrogate stan- 
dard deviation is the square root of (37). Figure 3(b) 
shows the corresponding standardized residual K- 
function obtained by dividing by the surrogate stan- 
dard deviation. 

Although this model is of the correct form, the 
standardized residual exceeds 2 for small values of r. 
This is consistent with the prediction in Section 9.4 
that the variance approximation would be inaccu- 
rate for small r. The null model is a nonstationary 
Poisson process; the minimum value of the inten- 
sity is 200. Taking p = 200 and applying the rule of 
thumb in Section 9.4 gives 

r nn = -^JL= = 0.04, 
v / 200^ 

suggesting that the Poincare variance estimate be- 
comes unreliable for r < 0.04 approximately. 

Formal significance interpretation of the critical 
bands in Figure 3(b) is limited, because the null dis- 
tribution of the standardized residual is not known 
exactly, and the values ±2 are approximate point- 
wise critical values, that is, critical values for the 
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(a) (b) 

Fig. 4. Null distribution of standardized residual of K . Pointwise 2.5% and 97.5% quantiles (grey shading) and sample mean 
(dotted lines) of TK from 1000 simulated realizations of model (D) with estimated parameter values (a) 7 = 0.217 and fi = 
(5.6,-0.46,3.35,2.05) using a 40 x 40 grid of dummy points; (b) 7 = 0.170 and /3 = (5.6, -0.64, 4.06, 2.44) using a 80 x 80 grid. 



score test based on fixed r. The usual problems of 
multiple testing arise when the test statistic is con- 
sidered as a function of r; see [29], page 14. For very 
small r there are small-sample effects so that a nor- 
mal approximation to the null distribution of the 
standardized residual is inappropriate. 

To confirm this, Figure 4 shows the pointwise 2.5% 
and 97.5% quantiles of the null distribution of TK, 
obtained by extensive simulation. The sample mean 
of the simulated TK is also shown, and indicates 
that the expected standardized residual is nonzero 
for small values of r. Repeating the computation 
with a finer grid of quadrature points (for approx- 
imating integrals over W involved in the pseudo- 
likelihood and the residuals) reduces the bias, sug- 
gesting that this is a discretization artefact. 

12.3.2 Comparison of competing models Figu- 
re 5(a) shows the empirical K- function and its com- 
pensator for each of the models (A)-(D) in Sec- 
tion 12.1. Figure 5(b) shows the corresponding resid- 
ual plots, and Figure 5(c) the standardized residuals. 
A positive or negative value of the residual suggests 
that the data are more clustered or more inhibited, 
respectively, than the model. The clear inference is 
that the Poisson models (A) and (B) fail to capture 
interpoint inhibition at range r ~ 0.05, while the ho- 
mogeneous Strauss model (C) is less clustered than 
the data at very large scales, suggesting that it fails 



to capture spatial trend. The correct model (D) is 
judged to be a good fit. 

The interpretation of this example requires some 
caution, because the residual ^-function of the fit- 
ted Strauss models (C) and (D) is constrained to 
be approximately zero at r = R = 0.05. The max- 
imum pseudo-likelihood fitting algorithm solves an 
estimating equation that is approximately equiva- 
lent to this constraint, because of (43). 

It is debatable which of the presentations in Fig- 
ure 5 is more effective at revealing lack of fit. A com- 
pensator plot such as Figure 5(a) seems best at cap- 
turing the main differences between competing mod- 
els. It is particularly useful for recognizing a gross 
lack of fit. A residual plot such as Figure 5(b) seems 
better for making finer comparisons of model fit, 
for example, assessing models with slightly different 
ranges of interaction. A standardized residual plot 
such as Figure 5(c) tends to be highly irregular for 
small values of r, due to discretization effects in the 
computation and the inherent nondifferentiability of 
the empirical statistic. In difficult cases we may ap- 
ply smoothing to the standardized residual. 

12.4 Application of G Diagnostics 

12.4.1 Diagnostics for correct model Consider 
again the model of the correct form (D). The resid- 
ual and compensator of the empirical nearest neigh- 
bor function G for the fitted model are shown in 
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Fig. 5. Model diagnostics based on pairwise distances, for each of the models (A)-(D) fitted to the data in Figure 1(b). 
(a) K and its compensator under each model, (b) Residual K -function (empirical minus compensator) under each model. 
(c) Standardized residual K -function under each model. 



Figure 6. The residual plot suggests a marginal lack 
of fit for r < 0.025. This may be correct, since the fit- 
ted model parameters (Section 12.3.1) are marginally 
poor estimates of the true values, in particular, of 
the interaction parameter. This was not reflected so 
strongly in the K diagnostics. This suggests that the 
residual of G may be particularly sensitive to lack 
of fit of interaction. 

Applying the rule of thumb in Section 10.3, we 
have r cr it = 0.044, agreeing with the interpretation 



that the ±2 limits are not trustworthy for r < 0.05 
approximately. 

Figure 7 shows the pointwise 2.5% and 97.5% quan- 
tiles of the null distribution of TG. Again, there is 
a suggestion of bias for small values of r which ap- 
pears to be a discretization artefact. 

12.4.2 Comparison of competing models For each 
of the four models, Figure 8(a) shows G and its 
Papangelou compensator. This clearly shows that 
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Fig. 6. Residual diagnostics obtained from the perturbing G-model when the data pattern is a realization of an mhomogeneous 
Strauss process, (a) G and its compensator under a fitted model of the correct form, and theoretical G-function for a Poisson 
process, (b) Residual G-function and two-standard-deviation limits under the fitted model of the correct form. 



the Poisson models (A) and (B) fail to capture in- 
terpoint inhibition in the data. The Strauss mod- 
els (C) and (D) appear virtually equivalent in Fig- 



ure 



Figure 8(b) shows the standardized residual of G, 
and Figure 8(c) the pseudo-residual of Vq (he., the 
pseudo-residual based on the pertubing Geyer mo- 
del), with spline smoothing applied to both plots. 
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Fig. 7. Null distribution of standardized residual of G. Pointwise 2.5% and 97.5% quantiles (grey shading) and sam- 
ple mean (dotted lines) from 1000 simulated realizations of model (D) with estimated parameter values (a) 7 = 0.217 
and P = (5.6,-0.46,3.35,2.05) using a 40 x 40 grid of dummy points; (b) 7 = 0.170 and $ = (5.6,-0.64,4.06,2.44) using 
a 80 x 80 grid. 
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Fig. 8. Diagnostics based on nearest neighbor distances, for the models (A) -(D) fitted to the data in Figure 1(b). (a) Compen- 
sator for G. (b) Smoothed standardized residual of G. (c) Smoothed pseudo-residual derived from a perturbing Geyer model. 



The Strauss models (C) and (D) appear virtually 
equivalent in Figure 8(c). The standardized residual 
plot Figure 8(b) correctly suggests a slight lack of 
fit for model (C) while model (D) is judged to be 
a reasonable fit. 

12.5 Application of F Diagnostics 

Figure 9 shows the pseudo-residual diagnostics 
based on empty space distances. Both diagnostics 
clearly show models (A)-(B) are poor fits to data. 
However, in Figure 9(a) it is hard to decide which 



of the models (C)-(D) provide a better fit. Despite 
the close connection between the area-interaction 
process and the F-model, the diagnostic in Figu- 
re 9(b) based on the F-model performs better in 
this particular example and correctly shows (D) is 
the best fit to data. In both cases it is noticed that 
the pseudo-sum has a much higher peak than the 
pseudo-compensators for the Poisson models (A)- 
(B), correctly suggesting that these models do not 
capture the strength of inhibition present in the 
data. 
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13. TEST CASE: CLUSTERING WITHOUT 
TREND 

13.1 Data and Models 

Figure 1(c) is a realization of a homogeneous Geyer 
saturation process [31] on the unit square, with first 
order term A = exp(4), saturation threshold s = 4.5 
and interaction parameters r = 0.05 and 7 = 
exp(0.4) « 1.5, that is, the density is 

(56) /(x) ocexp(n(x)logA + VG, s (x,r)log7), 

where 

V G , s {x,r) = ^mirJ s, ^ I{||x; - Xj\\ <r} L 

This is an example of moderately strong clustering 
(with interaction range R = 2r = 0.1) without trend. 
The main challenge here is to correctly identify the 
range and type of interaction. 

We fitted three point process models to the data: 
(E) a homogeneous Poisson process (CSR); (F) a ho- 
mogeneous area-interaction process with disc radius 
r = 0.05; (G) a homogeneous Geyer saturation pro- 
cess of the correct form, with interaction parame- 
ter r = 0.05 and saturation threshold s = 4.5 while 
the parameters A and 7 in (56) are unknown. The 
parameter estimates for (G) were logA = 4.12 and 
log7 = 0.38. 



13.2 Application of K Diagnostics 

A plot (not shown) of the K- function and its com- 
pensator, under each of the three models (E)-(G), 
demonstrates clearly that the homogeneous Poisson 
model (E) is a poor fit, but does not discriminate 
between the other models. 

Figure 10 shows the residual K and the smoothed 
standardized residual K for the three models. These 
diagnostics show that the homogeneous Poisson mo- 
del (E) is a poor fit, with a positive residual suggest- 
ing correctly that the data are more clustered than 
the Poisson process. The plots suggests that both 
models (F) and (G) are considerably better fits to 
the data than a Poisson model. They show that (G) 
is a better fit than (F) over a range of r values, and 
suggest that (G) captures the correct form of the 
interaction. 

13.3 Application of G Diagnostics 

Figure 11 shows G and its compensator, and the 
corresponding residuals and standardized residuals, 
for each of the models (E)-(G) fitted to the clus- 
tered point pattern in Figure 1(c). The conclusions 
obtained from Figure 11(a) are the same as those 
in Section 13.2 based on K and its compensator. 
Figure 12 shows the smoothed pseudo-residual di- 
agnostics based on the nearest neighbor distances. 
The message from these diagnostics is very similar 
to that from Figure 11. 
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Fig. 10. Model diagnostics based on pairwise distances for each of the models (E)-(G) fitted to the data in Figure 1(c). 
(a) Residual K ; (b) smoothed standardized residual K . 



Models (F) and (G) have the same range of in- 
teraction R = 0.1. Comparing Figures 10 and 11, 
we might conclude that the G-compensator appears 
less sensitive to the form of interaction than the K- 
compensator. Other experiments suggest that G is 
more sensitive than K to discrepancies in the range 
of interaction. 



13.4 Application of F Diagnostics 

Figure 13 shows the pseudo-residual diagnostics 
based on the empty space distances, for the three 
models fitted to the clustered point pattern in Fig- 
ure 1(c). In this case diagnostics based on the area- 
interaction process and the F-model are very simi- 




FlG. 11. Model diagnostics based on nearest neighbor distances for each of the models (E)-( G) fitted to the data in Figure 1(c). 
(a) G and its compensator under each model; (b) smoothed standardized residual G. 
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Fig. 12. Smoothed pseudo-residuals for each of the models (E)-(G) fitted to the clustered point pattern in Figure 1(c) when 
the perturbing model is (a) the Geyer saturation model with saturation 1 and (b) the G-model. 



lar, as expected due to the close connection between 
the two diagnostics. Here it is very noticeable that 
the pseudo-compensator for the Poisson model has 
a higher peak than the pseudo-sum, which correctly 
indicates that the data is more clustered than a Pois- 
son process. 

14. TEST CASE: JAPANESE PINES 

14.1 Data and Models 

Figure 1(a) shows the locations of seedlings and 
saplings of Japanese black pine, studied by Numata 
[53, 54] and analyzed extensively by Ogata and Tane- 
mura [55, 56]. In their definitive analysis [56] the fit- 
ted model was an inhomogeneous "soft core" pair- 
wise interaction process with log-cubic first order 
term \p(x,y) = exp(P/3(x, y)), where Pp is a cubic 
polynomial in x and y with coefficient vector (3, and 
density 

/09,ff3) ( x ) = c (/3,<x 2 ) exp (j>2 Pp(xi) 
(57) 

-5>Vii^ -*#)). 
i<j * 

where a 2 is a positive parameter. 

Here we evaluate three models: (H) an inhomoge- 
neous Poisson process with log-cubic intensity; 
(I) a homogeneous soft core pairwise interaction pro- 
cess, that is, when Pp(x,y) in (57) is replaced by 



a real parameter; (J) the Ogata-Tanemura mo- 
del (57). For more detail on the data set and the 
fitted inhomogeneous soft core model, see [7, 56]. 

A complication in this case is that the soft core 
process (57) is not Markov, since the pair poten- 
tial c(u,v) = exp(— <7 4 /||u — v || 4 ) is always positive. 
Nevertheless, since this function decays rapidly, it 
seems reasonable to apply the residual and pseudo- 
residual diagnostics, using a cutoff distance R such 
that | log c(it, | < e when ||n — u|| < R, for a spec- 
ified tolerance e. The cutoff depends on the fitted 
parameter value a 2 . We chose e = 0.0002, yielding 
R = 1 . Estimated interaction parameters were a 2 = 
0.11 for model (I) and a 2 = 0.12 for model (J). 

14.2 Application of K Diagnostics 

A plot (not shown) of K and its compensator for 
each of the models (H)-(J) suggests that the homo- 
geneous soft core model (I) is inadequate, while the 
inhomogeneous models (H) and (J) are reasonably 
good fits to the data. However, it does not discrim- 
inate between the models (H) and (J). 

Figure 14 shows smoothed versions of the residual 
and standardized residual of K for each model. The 
Ogata-Tanemura model (J) is judged to be the best 
fit. 

14.3 Application of G diagnostics 

Finally, for each of the models (H)-(J) fitted to 
the Japanese pines data in Figure 1(a), Figure 15(a) 
shows G? and its compensator. The conclusions are 
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Fig. 13. Pseudo-sum and pseudo-compensators for the models (E)-(G) fitted to the clustered point pattern in Figure 1(c) 
when the perturbing model is (a) the area-interaction process and (b) the F-model. 



the same as those based on K shown in Figure 14. 
Figure 16 shows the pseudo-residuals when using 
either a perturbing Geyer model [Figure 16(a)] or 
a perturbing G-model [Figure 16(b)]. Figures 16(a)- 
(b) tell almost the same story: the inhomogeneous 
Poisson model (H) provides the worst fit, while it 
is difficult to discriminate between the fit for the 
soft core models (I) and (J). In conclusion, consid- 
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ering Figures 14, 15 and 16, the Ogata-Tanemura 
model (J) provides the best fit. 

14.4 Application of F diagnostics 

Finally, the empty space pseudo-residual diagnos- 
tics are shown in Figure 17 for the Japanese Pines 
data in Figure 1(a). This gives a clear indication 
that the Ogata-Tanemura model (J) is the best fit 
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Fig. 14. Model diagnostics based on pairwise distances for each of the models (H)-(J) fitted to the Japanese pines data in 
Figure 1(a). (a) Smoothed residual K; (b) smoothed standardized residual K. 
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to the data, and the data pattern appears to be 
too regular compared to the Poisson model (H) and 
not regular enough for the homogeneous softcore 
model (I). 

15. SUMMARY OF TEST CASES 

In this section we discuss which of the diagnostics 
we prefer to use based on their behavior for the three 
test cases in Sections 12-14. 

Typically, the various diagnostics supplement each 
other well, and it is recommended to use more than 



one diagnostic when validating a model. It is well 
known that K is sensitive to features at a larger scale 
than G and F. Compensator and pseudo-compensa- 
tor plots are informative for gaining an overall pic- 
ture of model validity, and tend to make it easy 
to recognize a poor fit when comparing competing 
models. To compare models which fit closely, it may 
be more informative to use (standardized) residu- 
als or pseudo-residuals. We prefer to use the stan- 
dardized residuals, but it is important not to over- 
interpret the significance of departure from zero. 
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Fig. 16. Smoothed pseudo-residuals for each of the models (H)-(J) fitted to the Japanese pines data in Figure 1(a) when the 
perturbing model is (a) the Geyer saturation model with saturation 1 (null fitted on a fine grid) and (b) the G-model. 



Based on the test cases here, it is not clear whether 
diagnostics based on pairwise distances, nearest 
neighbor distances, or empty space distances are 
preferable. However, for each of these we prefer to 
work with compensators and residuals rather than 
pseudo-compensators and pseudo-residuals when pos- 
sible (i.e., it is only necessary to use pseudo- versions 
for diagnostics based on empty space distances) . For 

o 




(a) 

Fig. 17. Pseudo-sum and pseudo-compensators for the n 
the perturbing model is (a) the area-interaction process and 



instance, for the first test case (Section 12) the best 
compensator plot is that in Figure 5(a) based on 
pairwise distances (K and CK) which makes it easy 
to identify the correct model. On the other hand, 
in this test case the best residual type plot is that 
in Figure 8(b) based on nearest neighbor distances 
(TG) where the correct model is the only one within 
the critical bands. However, in the third test case 
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(Section 14) the best compensator plot is one of 
the plots in Figure 17 with pseudo-compensators 
based on empty space distances (EAVa and CAVa 
or EAF and CAF, respectively) which clearly indi- 
cates which model is correct. 

In the first and third test cases (Sections 12 
and 14), which both involve inhomogeneous models, 
it is clear that K and its compensator are more sen- 
sitive to lack of fit in the first order term than G and 
its compensator [compare, e.g., the results for the 
homogeneous model (C) in Figures 5(a) and 8(b)]. 
It is our general experience that diagnostics based 
on K are particularly well suited to assess the pres- 
ence of interaction and to identify the general form 
of interaction. Diagnostics based on K and, in par- 
ticular, on G seem to be good for assessing the range 
of interaction. 

Finally, it is worth mentioning the computational 
difference between the various diagnostics (timed on 
a 2.5 GHz laptop). The calculations for K and CK 
used in Figure 2 are carried out in approximately 
five seconds, whereas the corresponding calculations 
for G and CG only take a fraction of a second. For 
EAF and CAF, for example, the calculations take 
about 45 seconds. 

16. POSSIBLE EXTENSIONS 

The definition of residuals and pseudo-residuals 
should extend immediately to marked point pro- 
cesses. For space-time point processes, residual di- 
agnostics can be defined using the spatiotemporal 
conditional intensity (i.e., given the past history). 
Pseudo-residuals are unnecessary because the likeli- 
hood of a general space-time point process is a prod- 
uct integral (Mazziotto-Szpirglas identity). In the 
space-time case there is a martingale structure in 
time, which gives more hope of rigorous asymptotic 
results in the temporal (long-run) limit regime. 

Residuals can be derived from many other sum- 
mary statistics. Examples include third-order and 
higher-order moments (Appendix A.l), tessellation 
statistics (Appendix A. 2), and various combinations 
of F, G and K. 

In the definition of the extended model (25) the 
canonical statistic S could have been allowed to de- 
pend on the nuisance parameter 6, but this would 
have complicated our notation and some analysis. 

APPENDIX A: FURTHER DIAGNOSTICS 

In this appendix we present other diagnostics 
which we have not implemented in software, and 



which therefore are not accompanied by experimen- 
tal results. 

A.l Third and Higher Order Functional 
Summary Statistics 

While the intensity and X-function are frequently- 
used summaries for the first and second order mo- 
ment properties of a spatial point process, third and 
higher order summaries have been less used [49, 67, 
70, 72]. 

Statistic of order k For a functional summary sta- 
tistic of fcth order, say, 

(58) S(x,r) = ^2 q({x h ,...,x ik },r), 

we obtain 

£AS(x,r) 

(59) =k\S(x,r) 

= kl q({x h ,...,x ik },r), 

CAS(x,r) 
= k\CS(x,r) 

(60) =(Jfe-l)! 

• / J2 q({x il ,...,x ik _ 1 ,u},r) 

• Xg(u, x) du, 

PU(0,r) 

(61) 

= klKS(x,r) = klSfcr) - fc!CS(x,r), 

where 11,12,... are pairwise distinct in the sums 
in (59)-(60). In this case again, pseudo-residual di- 
agnostics are equivalent to those based on residuals. 

Third order example For a stationary and isotropic 
point process (i.e., when the distribution of X is 
invariant under translations and rotations), the in- 
tensity and -function of the process completely 
determine its first and second order moment prop- 
erties. However, even in this case, the simplest de- 
scription of third order moments depends on a three- 
dimensional vector specified from triplets (xi,Xj,Xk) 
of points from X such as the lengths and angle be- 
tween the vectors — Xj and Xj — x^. This is often 
considered too complex, and instead one considers 
a certain one-dimensional property of the triangle 
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T(xi,Xj,Xk) as exemplified below, where L(xi,Xj,Xk) 
denotes the largest side in T(xi,Xj,Xk)- 

Let the canonical sufficient statistic of the per- 
turbing density (27) be 

5(x,r) = V T (x,r) 

(62) 

= ^ HL(xi,Xj,x k ) <r}. 

i<j<k 

The perturbing model is a special case of the triplet 
interaction point process studied in [31]. It is also 
a special case of (58) with 

q({xi,Xj,x k },r) =I{L(xi,Xj,x k ) < r}; 

residual and pseudo-residual diagnostics are equiva- 
lent and given by (59)-(61). 

A. 2 Tessellation Functional Summary Statistics 

Some authors have suggested the use of tessella- 
tion methods for characterizing spatial point pro- 
cesses [38]. A planar tessellation is a subdivision of 
planar region such as W or the entire plane M. 2 . 
For example, consider the Dirichlet tessellation of W 
generated by x, that is, the tessellation with cells 

C(xi\x) = {u £ W\ \\u — Xi\\ < \\u — Xj\\ 

for all Xj in x}, 

i = 1, . . . , n. 

Suppose the canonical sufficient statistic of the per- 
turbing density (27) is 

(63) S(x,r) = y G (x, r) = ^I{|C(^|x)| < r}. 

i 

This is a sum of local contributions as in (33), al- 
though not of local statistics in the sense mentioned 
in Section 6.3, since I{|C(xj|x)| < r} depends on 
those points in x_j which are Dirichlet neighbors 
to Xi and such points may of course not be r-close 
to Xi (unless r is larger than the diameter of W). 
We call this perturbing model a soft Ord process; 
Ord's process as defined in [10] is the limiting case 
4> — > — oo in (27), that is, when r is the lower bound 
on the size of cells. Since Vb(x) < n(x), the perturb- 
ing model is well-defined for all <f> £ M. 

Let ~ x denote the Dirichlet neighbor relation for 
the points in x, that is, Xi ~ x Xj if C(xj|x) n 
C{xj\x) 7^ 0. Note that X{ ~ x X{. Now, 

A u S(x,r) =I{\C(u\xU {u})\ <r} 

(64) + [I{\C(v\-xU{u})\<r} 

-I{|C(t;|x\M)|<r}] 



depends not only on the points in x which are Dirich- 
let neighbors to u (with respect to ~ x u{«}) Du t a l so 
on the Dirichlet neighbors to those points (with re- 
spect to ~ x u{m} or with respect to ~ X \{ M }). In other 
words, if we define the iterated Dirichlet neighbor re- 
lation by that Xi ~ x Xj if there exists some x k such 
that Xi ~ x Xk and Xj ~ x Xf~, then t(u, x) depends on 
those points in x which are iterated Dirichlet neigh- 
bors to u with respect to ~ x u{ u } or with respect to 
~x\{«}- The pseudo-sum associated to the soft Ord 
process is 

SAF (x,r) = F (x,r) 

+ E E Ml^-|x)|<r} 

-I{|C(x,|x_,)|<r}] 

and from (29) and (64) we obtain the pseudo-com- 
pensator. From (36) and (63), we obtain the Papan- 
gelou compensator 

CV (x,r)= f I{\C(u\xU{u})\ <r}A e ~(n,x)dn. 

Many other examples of tessellation characteris- 
tics may be of interest. For example, often the De- 
launay tessellation is used instead of the Dirich- 
let tessellation. This is the dual tessellation to the 
Dirichlet tessellation, where the Delaunay cells gen- 
erated by x are given by those triangles T(xj, Xj,x k ) 
such that the disc containing Xj, Xj,x k in its bounda- 
ry does not contain any further points from x (strict- 
ly speaking we need to assume a regularity condi- 
tion, namely, that x has to be in general quadratic 
position; for such details, see [10]). For instance, the 
summary statistic £(x, r) given by the number of 
Delaunay cells T(x{,Xj,Xk) with L(xi,xj,Xk) <r, 
where L(u,v,w) is the length of the triangle with 
vertices u, v, w, is a kind of third order statisticis re- 
lated to (62) but concerns only the maximal cliques 
of Dirichlet neighbors (assuming again the general 
quadratic position condition). The corresponding 
perturbing model has not been studied in the lit- 
erature, to the best of our knowledge. 

APPENDIX B: VARIANCE FORMULAE 

This appendix concerns the variance of diagnostic 
quantities of the form 

1=2 h(xi,~K-i) — / h(u,x.)\g(u,~K) du, 
R = h(xj, X_j) — / h(u,x)Xg(u, X)d«, 
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where h(-) is a functional for which these quan- 
tities are almost surely finite, X is a point pro- 
cess on W with Papangelou conditional intensity 
X$(u, X) and 9 is an estimate of 9 (e.g., the MPLE). 

B.l General Identity 

Exact formulae for the variance of the innovation / 
and residual R are given in [4] . Expressions for Vari? 
are unwieldy [4], Section 6, but to a first approxi- 
mation we may ignore the effect of estimating 9 and 
consider the variance of /. Suppressing the depen- 
dence on 9, this is ([4], Proposition 4), 



APPENDIX C: MODIFIED EDGE 
CORRECTIONS 

Appendices C-E describe modifications to the stan- 
dard edge corrected estimators of K(r) and G(r) 
required in the conditional case (Section 2.3) be- 
cause the Papangelou conditional intensity X(u,x) 
can or should only be evaluated at locations u S W° 
where W° C W. Corresponding compensators are 
also given. 

Assume the point process is Markov and we are in 
the conditional case as described in Section 5.4. Con- 
sider an empirical functional statistic of the form 



(65) 



Var/= / E[/i(u,X) 2 A(u,X)]du 
Jw 



(67) S w (x,r) 



s w (xi,x\{xi},r) 



w 2 



K[A(u v X) + B{u v X)] dndu with compensator (in the unconditional case) 



where 

A(u, v, X) = A u h(v, X)A v h(u, X)A2(it, v , X), 
B(u,v,X.) = h(u,X)h(v,X.) 

■ {\{u,X)\{v,X.) - \ 2 (u,v,X.)}, 

where \2(u,v,x) = X(u,x)\(v,xU{u}) is the second 
order Papangelou conditional intensity. Note that 
for a Poisson process -B(u,t>,X) is identically zero. 

B.2 Pseudo-Score 

Let S(x, z) be a functional summary statistic with 
function argument z. Take h{u,~K) = A u S(x,z). 
Then the innovation / is the pseudo-score (23), and 
the variance formula (65) becomes 

Var[PU(#)] 

E[(A u 5(X,z)) 2 A(u,X)]dn 

w 

(66) + / E[(A u A v S(X,z)) 2 \ 2 (u,v,X)]dudv 
Jw 2 

+ / E[A u S(x,z)A v S(x,z) 
Jw 2 

■ {\(u, X)A(u, X) — \2(u, v, X)}] dudv, 
where for u^v and {u, v} n x = 0, 

A u A v S(x, z) = S(x U {u, v}, z)-S(xU {u}, z) 
- S(xU{v},z) + S{x,z) 
satisfies A u A v S(x, z) = A v A u S(x, z). 



C£V(x, r) = / sw(u,x, r)X^(u,x)du. 
Jw 

We explore two different strategies for modifying the 
edge correction. 

In the restriction approach, we replace W by W° 
and x by x° = x Pi W° , yielding 



S W o(x,r)= ^ s W o(xi,x°\{xi}, 



(68) 



C5vi/°(x,r)= / SM/°(n,x°,r)Ag(u,x°|x + ) du. 
Jw° 

Data points in the boundary region W + are ignored 
in the calculation of the empirical statistic Sw° ■ The 
boundary configuration x + = xfl W + contributes 
only to the estimation of 9 and the calculation of the 
Papangelou conditional intensity A^(-, -|x + ). This has 
the advantage that the modified empirical statis- 
tic (68) is identical to the standard statistic S com- 
puted on the subdomain W°; it can be computed 
using existing software, and requires no new theo- 
retical justification. The disadvantage is that we lose 
information by discarding some of the data. 

In the reweighting approach we retain the bound- 
ary points and compute 

Sw°,w(x-,r)= ^2 s w°,w{xi,x\{xi},r), 

CSw°,w( x , r )= / sw°,w(u,x,r)X^(u,x°\x + )du, 
Jw° 

where sw°,w(') is a modified version of s\v{-)- Bound- 
ary points contribute to the computation of the mod- 
ified summary statistic S\y° w an d its compensator. 
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The modification is designed so that Sw°,w has 
properties analogous to Sw- 

The K-function and G-function of a point pro- 
cess Y in M. 2 are defined [63, 64] under the assump- 
tion that Y is second order stationary and strictly 
stationary, respectively. The standard estima- 
tors K\y(r) and G x (r) of the K- function and G- 
function, respectively, are designed to be approxi- 
mately pointwise unbiased estimators when applied 

toX = Ynw. 

We do not necessarily assume stationarity, but 
when constructing modified summary statistics 
Kw°,w{ r ) and Gw°,w(r), we shall require that they 
are also approximately pointwise unbiased estima- 
tors of K(r) and G(r), respectively, when Y is sta- 
tionary. This greatly simplifies the interpretation of 
plots of Kyi/o W {r) and Gw°,w{ r ) and their compen- 
sators. 

APPENDIX D: MODIFIED EDGE 
CORRECTIONS FOR THE K-FUNCTION 

D.l Horvitz-Thompson Estimators 

The most common nonparametric estimators of 
the K- function [9, 57, 63] are continuous Horvitz- 
Thompson type estimators [8, 20] of the form 



(69) 



K(r) = K w (r) 



P 2 (*)\W\ j 



Xj\\ < r}. 



Here p 2 = p 2 (x) should be an approximately unbi- 
ased estimator of the squared intensity p 2 under 
stationarity. Usually p 2 (x) = n(n — 1)/\W\ 2 where 
n = n(x). 

The term ew(u,v) is an edge correction weight, 
depending on the geometry of W, designed so that 
the double sum in (69), say, Y(r) = p 2 (x)\W\K(r), 
is an unbiased estimator of Y(r) = p 2 \W\K(r). Pop- 
ular examples are the Ohser-Stoyan translation edge 
correction with 



(70) 



e w (u,v) 



\W\ 



|Wn(W + («-i;))| 
and Ripley's isotropic correction with 

ew(u,v) = ew(u,v) 

2tt\\u — v\\ 



(71) 



Estimators of the form (69) satisfy the local decom- 
position (67) where 

i 



p 2 {x\j{u})\W\ 
■'^2ew(u,Xj)I{\\u — Xj\\ <r}, u £ x. 

3 

Now we wish to modify (69) so that the outer sum- 
mation is restricted to data points Xj in W° C W, 
while retaining the property of unbiasedness for sta- 
tionary and isotropic point processes. The restric- 
tion estimator is 



K W o(r) 



(72) 



£2( x o)|Wo| 

■^2 e W°(xi,Xj)I{\\xi -Xj\\ < r}, 



where the edge correction weight is given by (70) 
or (71) with W replaced by W° . A more efficient 
alternative is to replace (69) by the reweighting es- 
timator 



(73) 



K W o W (r) 
1 



^(x)!^ ! 



^2 ^ e W°,w(xi,Xj)I{\\xi - xj\\ < r}, 



length(9J3(«, ||u-t;||)nW)' 



where ew°,w( u , v ) is a modified version of ew(-) 
constructed so that the double sum in (73) is unbi- 
ased for Y(r). Compared to the restriction estima- 
tor (72), the reweighting estimator (73) contains ad- 
ditional contributions from point pairs (xj, Xj) where 
x i G x° and xj G x + . 

The modified edge correction factor ew°,w(') 
for (73) is the Horvitz-Thompson weight [9] in an 
appropriate sampling context. Ripley's [63, 64] iso- 
tropic correction (71) is derived assuming isotropy, 
by Palm conditioning on the location of the first 
point Xi, and determining the probability that Xj 
would be observed inside W after a random rotation 
about Xi. Since the constraint on xj is unchanged, 
no modification of the edge correction weight is re- 
quired, and we take ew°,w{') — ew(') as in (71). 
Note, however, that the denominator in (73) is 
changed from \W\ to \ W°\. 

The Ohser-Stoyan [58] translation correction (70) 
is derived by considering two-point sets ( 
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pled under the constraint that both x\ and Xj are in- 
side W. Under the modified constraint that Xi £ W° 
and Xj £W, the appropriate edge correction weight 
is 

ew°,w{u, v) = ew°,w{u — v) 

_ \W n {W° + (u - v))\ 

~ \W°\ 

so that l/ew° ,w{ z ) is the fraction of locations u 
in W° such that u + W. 

D.2 Border Correction 

A slightly different creature is the border corrected 
estimator [using usual intensity estimator p = n(x)/ 
\W\] 

W{ ' n(x)n(xnW er ) 

■J2 E HxitWerjliWxi-XjW^r} 



with compensator (in the unconditional case) 
\W\E Xi exH\\ u -*i\\<r} 



CK w {r) 



w er (n(x) + l)(n(xnWer) + l) 



• A^(u,x°|x + ) du. 
The restriction estimator is 

|tu°| 

Kw ° (r) = n(x°)n(xnW° r ) 

• ]T Hxi£W° r }I{\\xi-xj\\<r} 

and the compensator is 

\W°\^2 Xiex ol{\\u- Xj \\<r} 



CK W o (r) 



wlr ( n (x») + i)( n (xn^) + i) 

• Ag(«,x°|x + ) du. 



Typically, W° = W q r, so Wq t is equal to W e r R+r y 
The reweighting estimator is 



|W| 



n(x)n(x° n W er ) 

■ E Hxi^We^IiWxi-XjW^r} 



and the compensator is 



I^IE, jex i{h-^ll<^} 

r«, nW r er (n(x) + l)(n(x°nW Q r) + l) 
■ \n(u,x°\x + ) du. 



Usually W° = W eR , so JU° n W Qr is equal to 
Wq m ax(R,r) • From this we conclude that when using 
border correction we should always use the reweight- 
ing estimator since the restriction estimator discards 
additional information and neither the implementa- 
tion nor the interpretation is easier. 

APPENDIX E: MODIFIED EDGE 
CORRECTIONS FOR NEAREST NEIGHBOR 
FUNCTION G 

E.l Hanisch Estimators 

Hanisch [32] considered estimators for G(r) of the 
form Gw( r ) = -Dx(^)/p, where p is some estimator 
of the intensity p, and 

Hx l £W ed Jl{d i <r} 



(74) L> x (r) 



E 



where di = d(xi, x\ {xi}) is the nearest neighbor dis- 
tance for Xi. If p were replaced by p, then Gw{f) 
would be an unbiased, Horvitz-Thompson estima- 
tor of G(r). See [71], pages 128-129, [9]. Hanisch's 
recommended estimator D4 is the one in which p is 
taken to be 



Ac(oo) 



E 



i{ xi ew edi } 



\w, 



edi 



This is sensible because -D x (oo) is an unbiased esti- 
mator of p and is positively correlated with Z? x (r). 
The resulting estimator Gy/{r) can be decomposed 
in the form (67) where 

I{ueW ed(u ^}I{d(u,x)<r} 
sw{u,x,r) = 

Acu{ u }( TO Wed(u,x)l 

for u ^ x, where d(it,x) is the ("empty space") dis- 
tance from location u to the nearest point of x. 
Hence, the corresponding compensator is 



CG w (r) 



l{ueW ed{u ^}I{d(u,x)<r} 
w AcU{«}(c»)|Wed(u,x)l 

' A/j(u,x) du. 



This is difficult to evaluate, since the denominator 
of the integrand involves a summation over all data 
points: -D x u{m}(°°) i s n °t related in a simple way to 
-D x (oo). Instead, we choose p to be the conventional 
estimator p = n(x)/\W\. Then 



G w (r) 



\W\ 
n(x) 
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which can be decomposed in the form (67) with 
\W\ I{ueW ed{u ^}I{d(u,x)<r} 



s w {u,x,r)- 



n(x) + l 



\W ( 



Qd(u,x)\ 



E.2 Border Correction Estimator 

The classical border correction estimate of G is 
1 



G W (r) 



for u ^ x, so that the compensator is 
CG w (r) 

I{ueW ed{U)X) }I{d\ 



(78) 



fi(xfl%) 

• H x i e W er }I{d(x u x_i) < r} 



n(x) + 1 J w 



{u,x) I 



with compensator (in the unconditional case) 

1 



Ag(u,x) du. 



CG w (r) 



In the restriction estimator we exclude the bound- 
ary points and take d° = d(xi,x c i i ), effectively re- 
placing the data set x by its restriction x° = x n W° : 



(79) 



1 + n(x n W Qr ) 

I{d(u,x) < r}Xx(u,x) du. 



G W o(r) 



\W°\ 
n(x°) 



E 

Xi€x° 



I{ Xi eW° d0 }I{d°<r} 



In the conditional case, the Papangelou conditional 
intensity \z(u,x) must be replaced by Ag(u,x°|x + ) 
given in (24). The restriction estimator is obtained 
by replacing W by W° and x by x° in (78)-(79), 



The compensator is (75) but computed for the point yielding 
pattern x° in the window W°: 



G W o(r) 



1 



CGw°{f) 



\W°\ 



n(x°) + 1 J w 



I{"€W| d(ll|XO) }I{d(u 1 x°)<r} 
• A^(m,x°|x + ) dn. 



n(xnW° r ) 

• Hxi£W° r }I{d( Xl , X %)<r}, 



CG W o(r) 



l + n(xnW| P ) 

l{d(u,x°) <r}A«(it,x°|x + )du. 



ir? 



In the usual case W° = Wqr, we have W^ d = 
w e(R+d) ■ 

In the reweiqhtinq estimator we take gL = d(xi,x\ _ . „ , m , 

r i \ rp * „ ..mi. 4. Typically, F = W^ji so that W2- = W G (R+r ). The 

To retain the Horvitz-Thompson property, . ,, ,. , . ... 



we must replace the weights l/|WgdJ in (74) by 
1/IT4 70 D WgdJ. Thus, the modified statistics are 



(76) G w °,w{r) 



\W\ 
n(x) 



E 



I{xi£W edi }I{di<r} G w °,w(r 

\W°nw edi \ 



reweighting estimator is obtained by restricting x% 
and u in (78)-(79) to lie in W°, yielding 

1 



and 



n(x° n W Qr ) 

■ H x i£ Wer}l{d{xi,x_i)<r}, 



Xi£X° 



(77) 



CGw°,w(r) 
\W\ 



CG w °,w(r) 



1 



n(x) + 1 J W o 



I{«€W ed(UiX) }I{d(u,x)<r} 



1 + n(x° n W er 



_i) 

\w°n Wc 



Qd(u,x)\ 



■J 



W°nW P 



Ag(u,x°|x + ) du. 



I{d(it,x) <r} 
■ \§(u, x°|x + ) du. 



\\\ 



Odi 



w ( 



In the usual case where W° = W eR , we have W° n In the usual case where W o = Wqr ^ we w r n 

emax(R,d,)- W Qr = WQ m3X m r y Again, the reweighting approach 

is preferable to the restriction approach. 

The border corrected estimator G(r) has relatively 
poor performance and sample properties [38], pa- 



Optionally, we may also replace |W|/n(x) in (76) 
by I W°\/n(x n W°), and, correspondingly, replace 
J§l_in (77) by \W°\/(n(xnW°) + 1). 
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ge 209. Its main advantage is its computational ef- 
ficiency in large data sets. Similar considerations 
should apply to its compensator. 
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